Age and Gender Prediction on Health Forum Data

Authors

  • Prasha Shrestha
  • Nicolas Rey-Villamizar
  • Farig Sadeque
  • Ted Pedersen
  • Steven Bethard
  • Thamar Solorio

Abstract

Health support forums have become a rich source of data that can be used to improve health care outcomes. A user profile, including information such as age and gender, can support targeted analysis of forum data. But users might not always disclose their age and gender. It is therefore desirable to be able to automatically extract this information from users’ content. However, to the best of our knowledge there is no such resource for author profiling of health forum data. Here we present a large corpus, with close to 85,000 users, for profiling and also outline our approach and benchmark results to automatically detect a user’s age and gender from their forum posts. We use a mix of features from a user’s text as well as forum-specific features to obtain accuracy well above the baseline, thus showing that both our dataset and our method are useful and valid.

Similar resources

Prediction of Maternal-Fetal Attachment Based on the Components of Gender Role in Pregnant women

Background and aim: Maternal concept is part of the feminine gender role. The important part of the maternal concept is the unique relationship experience between mother and child that begins with maternal-fetal attachment (MFA) during pregnancy. The aim of this study is to predict MFA according to gender role in pregnant women in Shiraz city. Methods: This descriptive correlational study was c...

A Document Weighted Approach for Gender and Age Prediction Based on Term Weight Measure

Author profiling is a text classification technique that is used to predict the profiles of the authors of unknown texts by analyzing their writing styles. Author profiles are characteristics of the authors such as gender, age, native language, country and educational background. The existing approaches for author profiling suffer from problems like high dimensionality of features and fail to capture th...

Enneagram Personality System as an Effective Model in Prediction of Risk of Cardiovascular Diseases: A Case-Control Study

Introduction: Studies on behavioral patterns and personality traits play a critical role in the prediction of healthy or unhealthy behaviors and identification of high-risk individuals for cardiovascular diseases (CVDs) in order to implement preventive strategies. This study aimed to compare personality types in individuals with and without CVD based on the enneagram of personality. Materials a...

Prediction of chronological age based on Demirjian dental age using robust ridge regression method

Introduction: Estimation of age has an important role in legal medicine, endocrine diseases and clinical dentistry. Correspondingly, evaluation of dental development stages is more valuable than tooth erosion. In this research, the modeling of calendar age has been done using new and rich statistical methods. Considerably, it can be considered a practicable method in medical science that is...

Prediction of mortality in patients admitted to intensive care units, A comparison of three data mining techniques: a brief report.

Background: Early outcome prediction of hospitalized patients is critical because the intensivists are constantly striving to improve patients' survival by taking effective medical decisions about ill patients in Intensive Care Units (ICUs). Despite rapid progress in medical treatments and intensive care technology, the analysis of outcomes, including mortality prediction, has been a challenge ...

Publication date: 2016